A critical analysis of self-supervision, or what we can learn from a single image
We look critically at popular self-supervision techniques for learning deep
convolutional neural networks without manual labels. We show that three
different and representative methods, BiGAN, RotNet and DeepCluster, can learn
the first few layers of a convolutional network from a single image as well as
using millions of images and manual labels, provided that strong data
augmentation is used. However, for deeper layers the gap with manual
supervision cannot be closed even if millions of unlabelled images are used for
training. We conclude that: (1) the weights of the early layers of deep
networks contain limited information about the statistics of natural images,
that (2) such low-level statistics can be learned through self-supervision just
as well as through strong supervision, and that (3) the low-level statistics
can be captured via synthetic transformations instead of using a large image
dataset.

Comment: Accepted paper at the International Conference on Learning Representations (ICLR) 202
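The "single image plus strong augmentation" recipe can be illustrated with a minimal sketch (pure NumPy; the crop size, jitter ranges, and function names are illustrative assumptions, not the authors' actual pipeline):

```python
import numpy as np

def augment(image, rng, crop=32):
    """One strong augmentation: random crop, horizontal flip, colour jitter."""
    h, w, _ = image.shape
    y = int(rng.integers(0, h - crop + 1))
    x = int(rng.integers(0, w - crop + 1))
    patch = image[y:y + crop, x:x + crop].astype(np.float32)
    if rng.random() < 0.5:                 # random horizontal flip
        patch = patch[:, ::-1]
    patch = patch * rng.uniform(0.6, 1.4)  # brightness jitter
    patch = patch + rng.uniform(-0.2, 0.2) # additive colour shift
    return np.clip(patch, 0.0, 1.0)

def single_image_dataset(image, n, seed=0):
    """Expand one source image into n augmented training samples."""
    rng = np.random.default_rng(seed)
    return np.stack([augment(image, rng) for _ in range(n)])

source = np.random.default_rng(1).random((64, 64, 3))
batch = single_image_dataset(source, 8)
print(batch.shape)  # (8, 32, 32, 3)
```

The point of the paper is that a stream of such crops from one image suffices to train the early layers as well as millions of distinct images do.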
A Study on the Relationship between Children's Developmental Stages and Sense of Color
It is well known that human sensitivity to color and expressive ability vary with age and gender. In addition, the perception, understanding, and interpretation of color vary according to developmental stage and color-related experience. This study is one approach to clarifying the relationship between this "sense of color" and the developmental stages of children.
In this study, the coloring behavior of subjects at three stages (elementary school, junior high school, and university students) on coloring-book images was investigated using iPads. The characteristics of the coloring and the color schemes used were analyzed to explore their relationship with the children's developmental stages. The coloring-book images used in the investigation, mandala-like patterns, were originally designed based on preliminary investigations. In addition, an original palette of colors, systematically arranged by hue and tone, was specified so that the characteristics of the colors used could be analyzed quantitatively.
The results showed that the hues used with high frequency in the coloring books changed as the developmental stage progressed, and that the range of tones (combinations of saturation and lightness) widened. It was also found that the color schemes were simple and easy to understand at younger ages, while their complexity increased as the children grew older.
Measuring the Interpretability of Unsupervised Representations via Quantized Reverse Probing
Self-supervised visual representation learning has recently attracted
significant research interest. While a common way to evaluate self-supervised
representations is through transfer to various downstream tasks, we instead
investigate the problem of measuring their interpretability, i.e. understanding
the semantics encoded in raw representations. We formulate the latter as
estimating the mutual information between the representation and a space of
manually labelled concepts. To quantify this we introduce a decoding
bottleneck: information must be captured by simple predictors, mapping concepts
to clusters in representation space. This approach, which we call reverse
linear probing, provides a single number sensitive to the semanticity of the
representation. This measure is also able to detect when the representation
contains combinations of concepts (e.g., "red apple") instead of just
individual attributes ("red" and "apple" independently). Finally, we propose to
use supervised classifiers to automatically label large datasets in order to
enrich the space of concepts used for probing. We use our method to evaluate a
large number of self-supervised representations, ranking them by
interpretability, highlight the differences that emerge compared to the
standard evaluation with linear probes and discuss several qualitative
insights. Code at: https://github.com/iro-cp/ssl-qrp

Comment: Published at ICLR 2022. Appendix included, 26 page
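The core idea of reverse probing can be shown with a toy numerical sketch: quantize representations into clusters, then score how well concept labels alone predict each sample's cluster (all names and the tiny synthetic data are illustrative assumptions; the paper's estimator is more careful than this frequency-based predictor):

```python
import numpy as np

def quantize(reps, k, iters=20, seed=0):
    """Vector-quantise representations with plain k-means,
    a toy stand-in for the paper's quantisation step."""
    rng = np.random.default_rng(seed)
    centers = reps[rng.choice(len(reps), k, replace=False)].copy()
    for _ in range(iters):
        d = ((reps[:, None] - centers[None]) ** 2).sum(-1)
        assign = d.argmin(1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = reps[assign == j].mean(0)
    return assign

def reverse_probe_accuracy(concepts, clusters):
    """Map each concept to its most frequent cluster, then score how often
    that mapping recovers the sample's cluster -- a crude proxy for how
    much concept information the representation's clusters carry."""
    preds = np.empty_like(clusters)
    for c in np.unique(concepts):
        ids, counts = np.unique(clusters[concepts == c], return_counts=True)
        preds[concepts == c] = ids[counts.argmax()]
    return float((preds == clusters).mean())

# Synthetic check: representations that encode the concept score highly.
rng = np.random.default_rng(0)
concepts = np.repeat(np.arange(3), 20)
reps = np.eye(3)[concepts] + 0.05 * rng.standard_normal((60, 3))
acc = reverse_probe_accuracy(concepts, quantize(reps, k=3))
```

A representation whose clusters ignore the concepts would score near chance, giving the single interpretability number the abstract describes.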
Self-labelling via simultaneous clustering and representation learning
Combining clustering and representation learning is one of the most promising
approaches for unsupervised learning of deep neural networks. However, doing so
naively leads to ill posed learning problems with degenerate solutions. In this
paper, we propose a novel and principled learning formulation that addresses
these issues. The method is obtained by maximizing the information between
labels and input data indices. We show that this criterion extends standard
cross-entropy minimization to an optimal transport problem, which we solve
efficiently for millions of input images and thousands of labels using a fast
variant of the Sinkhorn-Knopp algorithm. The resulting method is able to
self-label visual data so as to train highly competitive image representations
without manual labels. Our method achieves state-of-the-art representation
learning performance for AlexNet and ResNet-50 on SVHN, CIFAR-10, CIFAR-100 and
ImageNet and yields the first self-supervised AlexNet that outperforms the
supervised Pascal VOC detection baseline. Code and models are available.

Comment: Accepted paper at the International Conference on Learning Representations (ICLR) 202
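The equipartitioned label assignment behind this self-labelling can be sketched with a minimal Sinkhorn-Knopp iteration (a small NumPy version; variable names and the fixed iteration count are my assumptions — the paper runs a fast variant at the scale of millions of images):

```python
import numpy as np

def sinkhorn_assignments(scores, n_iters=50):
    """Balanced soft pseudo-labels from an N x K score matrix: alternately
    rescale columns and rows of exp(scores) so every label receives equal
    mass (1/K per column) and every sample is fully assigned (1/N per row)."""
    Q = np.exp(scores - scores.max())        # numerically stabilised, N x K
    Q /= Q.sum()
    N, K = Q.shape
    for _ in range(n_iters):
        Q /= Q.sum(axis=0, keepdims=True) * K   # columns -> mass 1/K
        Q /= Q.sum(axis=1, keepdims=True) * N   # rows    -> mass 1/N
    return Q * N                             # each row is a distribution

scores = np.random.default_rng(0).standard_normal((8, 2))
Q = sinkhorn_assignments(scores)
labels = Q.argmax(1)   # balanced pseudo-labels for the next training round
```

The balance constraint is what rules out the degenerate solution in which every image collapses onto a single label.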
Semantic Counting from Self-Collages
While recent supervised methods for reference-based object counting continue
to improve the performance on benchmark datasets, they have to rely on small
datasets due to the cost associated with manually annotating dozens of objects
in images. We propose Unsupervised Counter (UnCo), a model that can learn this
task without requiring any manual annotations. To this end, we construct
"SelfCollages", images with various pasted objects as training samples, that
provide a rich learning signal covering arbitrary object types and counts. Our
method builds on existing unsupervised representations and segmentation
techniques to successfully demonstrate the ability to count objects without
manual supervision. Our experiments show that our method not only outperforms
simple baselines and generic models such as FasterRCNN, but also matches the
performance of supervised counting models in some domains.

Comment: 24 pages. Code available at https://github.com/lukasknobel/SelfCollage
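The construction of one training sample can be sketched in a few lines (hypothetical names; the actual SelfCollages draw object crops from unsupervised segmentation, which this toy version replaces with a given patch):

```python
import numpy as np

def make_self_collage(background, obj, n_objects, rng):
    """Paste n_objects copies of an object crop onto a background image.
    The count n_objects is the free supervision signal -- no manual
    annotation is needed to train a counter on (image, count) pairs."""
    canvas = background.copy()
    H, W, _ = canvas.shape
    h, w, _ = obj.shape
    for _ in range(n_objects):
        y = int(rng.integers(0, H - h + 1))
        x = int(rng.integers(0, W - w + 1))
        canvas[y:y + h, x:x + w] = obj   # naive paste; overlaps allowed
    return canvas, n_objects

rng = np.random.default_rng(0)
bg = np.zeros((64, 64, 3))
patch = np.ones((8, 8, 3))
image, target = make_self_collage(bg, patch, 5, rng)
```

Varying the pasted object type and count per sample yields the "rich learning signal covering arbitrary object types and counts" the abstract describes.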
Nuclear spin relaxation rate of nonunitary Dirac and Weyl superconductors
Nonunitary superconductivity has attracted renewed interest as a novel
gapless phase of matter. In this study, we investigate the superconducting gap
structure of nonunitary odd-parity chiral pairing states in a superconductor
involving strong spin-orbit interactions. By applying a group theoretical
classification of chiral states in terms of discrete rotation symmetry, we
categorized all possible point-nodal gap structures in nonunitary chiral states
into four types in terms of the topological number of nodes and node positions
relative to the rotation axis. In addition to conventional Dirac and Weyl point
nodes, we identify a novel type of Dirac point node unique to nonunitary chiral
superconducting states. The node type can be identified experimentally based on
the temperature dependence of the nuclear magnetic resonance longitudinal
relaxation rate. The implication of our results for a nonunitary odd-parity
superconductor in UTe2 is also discussed.

Comment: 18 pages, 4 figure
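For context, the standard low-temperature power laws that make the longitudinal relaxation rate a useful probe of node type are (textbook results for clean superconductors, not this paper's new classification):

```latex
% The low-energy density of states N(E) sets the relaxation rate via
% 1/T_1 \sim T \int dE \, N(E)^2 \left(-\partial f/\partial E\right):
\frac{1}{T_1} \propto
\begin{cases}
  e^{-\Delta/T} & \text{full gap,}\\[2pt]
  T^{3} & \text{line nodes } (N(E)\propto E),\\[2pt]
  T^{5} & \text{point nodes } (N(E)\propto E^{2}).
\end{cases}
```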
Labelling unlabelled videos from scratch with multi-modal self-supervision
A large part of the current success of deep learning lies in the
effectiveness of data -- more precisely: labelled data. Yet, labelling a
dataset with human annotation continues to carry high costs, especially for
videos. While recent methods in the image domain have made it possible to
generate meaningful (pseudo-) labels for unlabelled datasets without
supervision, this development is missing in the video domain, where learning feature
representations is the current focus. In this work, we a) show that
unsupervised labelling of a video dataset does not come for free from strong
feature encoders and b) propose a novel clustering method that allows
pseudo-labelling of a video dataset without any human annotations, by
leveraging the natural correspondence between the audio and visual modalities.
An extensive analysis shows that the resulting clusters have high semantic
overlap to ground truth human labels. We further introduce the first
benchmarking results on unsupervised labelling of common video datasets
Kinetics, Kinetics-Sound, VGG-Sound and AVE.

Comment: Accepted to NeurIPS 2020. Project page:
https://www.robots.ox.ac.uk/~vgg/research/selavi, code:
https://github.com/facebookresearch/selav
Self-Ordering Point Clouds
In this paper we address the task of finding representative subsets of points
in a 3D point cloud by means of a point-wise ordering. Only a few works have
tried to address this challenging vision problem, all with the help of
hard-to-obtain point and cloud labels. Different from these works, we introduce the
task of point-wise ordering in 3D point clouds through self-supervision, which
we call self-ordering. We further contribute the first end-to-end trainable
network that learns a point-wise ordering in a self-supervised fashion. It
utilizes a novel differentiable point scoring-sorting strategy and it
constructs a hierarchical contrastive scheme to obtain self-supervision
signals. We extensively ablate the method and show its scalability and superior
performance even compared to supervised ordering methods on multiple datasets
and tasks including zero-shot ordering of point clouds from unseen categories
- …